In this section we recall some basic results about Gr\"obner bases. For a more detailed treatment, we refer the reader to, for instance, \cite{Cox1992}. Consider a polynomial ring $R = \F[x_0,\dots,x_{n-1}]$ over some finite field $\F$. We adopt some admissible ordering on monomials in $R$. We can then denote by $\LM{f}$ the largest or leading monomial appearing in $f \in R$ and by $\LC{f} \in \F$ the coefficient corresponding to $\LM{f}$ in $f$. By $\LT{f}$ we denote $\LC{f} \cdot \LM{f}$. In this work $\LV{f}$ denotes the largest variable -- ordered w.r.t. the monomial ordering -- in the leading monomial $\LM{f}$ of $f$, 
and given a set $F \subset R$, we define
$\LV{F,x}$ as $\{f \in F \mid \LV{f} = x\}$. 
The set of leading monomials of $F$ is defined as $\LM{F} = \{\LM{f} \mid f \in F\}$, $M$ denotes the set of all monomials in $R$, 
while $M(F)$ is the set of all monomials appearing in the polynomials in $F$. 

The ideal $\mathcal{I}$ generated by \sys $\in R$, denoted $\ideal{\sys}$, is defined 
as $$\left\{ \sum_{i=0}^{m-1} h_i f_i  \mid h_0 ,\dots , h_{m-1} \in R \right\}.$$ It is well-known that every ideal  $\mathcal{I} \subseteq R$ is finitely generated.
A Gr\"obner basis of an ideal $\mathcal{I}$ is a particular set of generators. 
\begin{definition}[Gr\"obner Basis]
Let $\mathcal{I}$ be an ideal of $\F[x_0,\dots,x_{n-1}]$ and fix a monomial ordering. A finite subset $$G = \{g_0 ,\dots , g_{m-1} \} \subset \mathcal{I}$$  is said to be a \emph{Gr\"obner basis} of $\mathcal{I}$ if for any $f \in \mathcal{I}$ there exists $g_i \in G$ such that $\LM{g_i} \mid \LM{f}$.
\end{definition}
We note that if a set of polynomials $\sys$ has a unique root, i.e. the system of equations $f_0=0,\ldots,f_{m-1}=0$ has a unique solution, then computation of the Gr\"obner basis of the corresponding
ideal allows one to solve the system (i.e. the solution can be ``read" directly on the Gr\"obner basis).
More generally, if the ideal is zero-dimensional, the solutions of a system can be computed from a Gr\"obner basis in polynomial-time (in the number of solutions) \cite{FGLM}.
  
Since the notion of Gr\"obner bases is defined by the existence of {\it relatively} low leading terms, the task of computing a Gr\"obner basis is essentially to find new elements in the ideal with lower leading terms until no more such elements can be found. Buchberger proved in his PhD thesis~\cite{Buchberger65} that Gr\"obner bases can be computed by considering only S-polynomials. Such polynomials are designed to cancel leading terms and thus potentially produce new elements in the ideal with lower leading terms.
\begin{definition}[S-Polynomial]
\label{def:spolynomials}
Let $f,g\ \in\ \F[x_0,\dots,x_{n-1}]$ be non-zero polynomials.
\begin{itemize}
\item Let $\LM{f} = \prod_{i=0}^{n-1} x_i^{\alpha_i}$ and $\LM{g} = \prod_{i=0}^{n-1} x_i^{\beta_i}$, with $\alpha_i,\beta_i \in \NN$, denote the leading monomials
of $f$ and $g$ respectively.
Set $\gamma_i = \max(\alpha_i,\beta_i)$ for every $0 \leq  i < n$, and denote by $x^\gamma\ = \prod_{i=0}^{n-1} x_i^{\gamma_i}$.  
It holds that $x^\gamma$ is the least common multiple of
$\LM{f}$ and $\LM{g}$, written as $$x^\gamma\ =\LCM{\LM{f},\LM{g}}.$$  
\item The {\it S-polynomial} of $f$ and $g$ is defined as
$$
S(f,g)\ =\ \frac{x^\gamma}{\textsc{LT}(f)}\cdot f\ -\
\frac{x^\gamma}{\textsc{LT}(g)}\cdot g.
$$
\end{itemize}
\end{definition}

Now let $G=\{g_0,\ldots, g_{s-1}\} \subset R$, and $\mathcal{I}$ be the ideal generated by $G$.
We say that a polynomial $f \in \mathcal{I}$ has a {\it standard representation} w.r.t. $G$ if 
there exist constants $a_0,\ldots, a_{s-1} \in \F$ and monomials $t_0,\ldots, t_{s-1} \in M$ such that     
$$
f=\sum_{k=0}^{s-1}a_k t_k g_k, 
$$    
with $\LM{t_k g_k} \leq \LM{f}$. Buchberger's main result stated that $G$ is a Gr\"obner basis for $\mathcal{I}$ if and only if every 
S-polynomial $S(g_i,g_j)$ has a  {\it standard representation} w.r.t. $G$.

Furthermore, Buchberger showed that in the computation of Gr\"obner bases it is \emph{sufficient} to consider S-poly\-nomials only, since \emph{any }reduction of leading terms can be attributed to S-polynomials. 
There are many variants of this result in textbooks on commutative algebra; we give below the statement and proof based on \cite{Cox1992} since the presentation helps to understand the close connection between 
XL and Gr\"obner basis algorithms.
The proof is included for the sake of completeness.
\input{lemma1}   
The following corollary is a simple generalisation of Lemma~\ref{lemma:cancel} to sums where not all summands have the same leading term.
\begin{corollary}
\label{corollary:cancel_general}
Let $f_0,\ldots,f_{t-1}$ be polynomials in $R$. 
Consider the polynomial $f$ as the sum $f = \Sigma^{t-1}_{i=0} c_ix^{\alpha(i)}f_i$, with coefficients 
$c_0, \ldots, c_{t-1} \in \mathbb{F}\backslash\{0\}$,
such that $\LM{f} < x^\delta = \max\{x^{\alpha(i)}\LM{f_i}\}$. 
Without loss of generality, we can assume that there is a $\tilde{t}$ such that $x^{\alpha(j)}\LM{f_j} = x^\delta$ for $j < \tilde{t}$ and $x^{\alpha(k)}\LM{f_k} < x^\delta$ for $k \geq \tilde{t}$.  
Then there exist constants $b_{i} \in \F$ such that
\begin{eqnarray*}
f &=& \sum_{i=0}^{\tilde{t}-2} b_{i}x^{\delta-\tau_{i}}S(f_i,f_{i+1}) + \sum_{k=\tilde{t}}^{t-1} c_kx^{\alpha(k)}f_k\\
 &=& \sum \tilde{c_i}x^{\tilde{\alpha}(i)}\tilde{f}_i,
\end{eqnarray*}
where \(x^{\tau_{j}} = \LCM{\LM{f_{j}},\LM{f_{j+1}}}\), $\tilde{c_i} x^{\tilde{\alpha}(i)} \tilde{f_i} = c_{i+1} x^{\alpha(i+1)} f_{i+1}$ if $i\geq \tilde{t}-1$ and $b_{i}x^{\delta-\tau_{i}}S(f_i,f_{i+1})$ otherwise. Furthermore, for all $0 \leq i \leq \tilde{t}-2$, we have $$\LM{x^{\delta-\tau_{i}}S(f_i,f_{i+1})} < x^\delta$$ and thus 
$$
x^{\tilde{\alpha}(i)}\LM{\tilde{f}_i} < x^\delta \mbox{  for all } i.
$$
\end{corollary}

Corollary~\ref{corollary:cancel_general} states essentially that whatever cancellations can be produced by monomial multiplications and $\F$-linear combinations, they can be attributed to S-polynomials. It follows that 
the only cancellations that need to be considered in an XL-style algorithm are those produced by S-polynomials. 

\begin{example}
Consider the polynomials $f =xy + x + 1$, $g = x + 1$ and $h = z + 1 \in \F_{127}[x,y,z]$, a pathological example constructed to demonstrate the role of S-polynomials. We fix the degree reverse lexicographical term ordering. To compute a Gr\"obner basis, we start by constructing two S-polynomials of degree two, namely: $f - y\cdot g = x - y + 1$ and $z\cdot g - x\cdot h = -x + z$. We note that the latter trivially reduces to zero and would be detected and avoided by Buchberger's first criterion \cite{Cox1992}. Ignoring this optimisation, in matrix notation, we would have to consider the six rows corresponding to $f, y\cdot g, z\cdot g, x\cdot h, g$ and $h$.
For comparison, XL would consider the following polynomials up to degree two.
\begin{center}
\begin{tabular}{rrrrrr}
$f  =$ & $xy + x + 1$, & \ \ $x\cdot g$ = & $x^2 + x$, & \ \ $y\cdot g =$ & $xy + y$,\\
$z\cdot g$ = & $xz + z$,& \ \   $x\cdot h =$ & $xz + x$,    & \ \ $y\cdot h$ = & $yz + y$,\\
$z\cdot h =$ & $z^2 + z$,    & \ \ $g $ = & $x + 1$, & \ \ $h  =$ & $z + 1$.\\
\end{tabular}
\end{center}

In matrix notation we have
\begin{scriptsize}
\[
A = \left(\begin{array}{rrrrrrrrr}
0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1
\end{array}\right) \mbox{  and  }
E = \left(\begin{array}{rrrrrrrrr}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -1 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & -1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & -1 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array}\right).\]
\end{scriptsize}
Of course, the system $f=0,g=0,h=0$ is straightforwardly solved by evaluating $f$ at $x= -1$ as implied by $g$. We note, however, that this computation is equivalent to reducing the S-polynomial $S(f,g)$ by $g$, i.e., the first step in Buchberger's algorithm.
\end{example}
Note that Lemma~\ref{lemma:cancel} does not state that $\LM{f} = \max\{\LM{S(f_j,f_{j+1})}\}$, but rather that the leading terms of summands decrease once rewritten using S-polynomials. In the following example, we consider the case when $\LM{f} < \max\{\LM{S(f_j,f_{j+1})}\}$. In this case, we can reapply Lemma~\ref{lemma:cancel} to $f_i' = S(f_i,f_j)$ as the following example emphasizes.

\begin{example} Consider the polynomials $f = xy + a$, $g = yz + b$, and $h = ab + 1$ in the polynomial ring $\F_{127}[x,y,z,a,b]$. We consider the degree reverse lexicographical term ordering.
There are three possible S-polynomials $S(f,g),S(f,h)$ and $S(g,h)$. Two of them -- $S(f,h)$ and $S(g,h)$ -- trivially reduce to zero and would be detected and avoided by Buchberger's first criterion. However, one S-polynomial does not reduce to zero: $s_0 = z\cdot f - x\cdot g = za - xb$.
From $s_0$ we can then construct $s_1 = b\cdot s_0 - z\cdot h = -xb^2 - z$, among others, also at degree 3, which is an element of the reduced Gr\"obner basis. The XL algorithm at degree 3 will produce $$\{m\cdot p \mid m \in \{1,x,y,z,a,b\}, p \in \{f,g,h\}\},$$ which reduces to 
\[
\begin{array}{llll}
 x^{2} y + x a, &\ x y^{2} + y a, &\  x y z + x b,  &\  y^{2} z + y b, \\ 
 y z^{2} + z b, &\ x y a + a^{2}, &\  y z a - 1,    &\  x y b - 1,     \\
 y z b + b^{2}, &\ x a b + x,     &\  y a b + y,    &\  z a b + z,     \\
 a^{2} b + a,   &\ a b^{2} + b,   &\  x y + a,      &\  y z + b,       \\
 z a - x b,     & \textnormal{ and } & a b + 1 &\\
\end{array}
\]
by Gaussian elimination. Note that $xb^2 + z$ is not in that list. However, if we increase the degree of XL to 4, the list returned is 
\[
\begin{array}{llll}

 x^{3} y + x^{2} a, &\ x^{2} y^{2} - a^{2}, &\ x y^{3} + y^{2} a, &\ x^{2} y z + x^{2} b,\\
 x y^{2} z + 1,     &\ y^{3} z + y^{2} b,   &\ x y z^{2} + x z b, &\ y^{2} z^{2} - b^{2},\\
 y z^{3} + z^{2} b, &\ x^{2} y a + x a^{2}, &\ x y^{2} a + y a^{2},&\ x y z a - x,\\
 y^{2} z a - y,     &\ y z^{2} a - z,       &\ x y a^{2} + a^{3}, &\ y z a^{2} - a,\\
 x^{2} y b - x,     &\ x y^{2} b - y,       &\ x y z b - z,  &\ y^{2} z b + y b^{2},\\
 y z^{2} b + z b^{2}, &\ x^{2} a b + x^{2}, &\ x y a b - a, &\ y^{2} a b + y^{2},\\
 x z a b + x z. &\ y z a b - b,             &\ z^{2} a b + z^{2},&\ x a^{2} b + x a,\\
 y a^{2} b + y a, &\  z a^{2} b + x b, &\  a^{3} b + a^{2}, &\  x y b^{2} - b,\\
 y z b^{2} + b^{3}, &\  x a b^{2} + x b, &\  y a b^{2} + y b, &\  z a b^{2} + z b, \\ 
 a^{2} b^{2} - 1, &\  a b^{3} + b^{2}, &\  x^{2} y + x a, &\  x y^{2} + y a, \\ 
 x y z + x b, &\  y^{2} z + y b, &\  y z^{2} + z b, &\  x y a + a^{2}, \\ 
 x z a - x^{2} b, &\  y z a - 1, &\  z^{2} a - x z b, &\  z a^{2} + x, \\ 
 x y b - 1, &\  y z b + b^{2}, &\  x a b + x, &\  y a b + y, \\ 
 z a b + z, &\  a^{2} b + a, &\  {\bf x b^{2} + z}, \ & a b^{2} + b, \\ 
 x y + a, &\  y z + b, &\  z a - x b & \textnormal{ and }  a b + 1 ,\\
\end{array}
\]
which does contain $xb^{2} + z$. Thus, XL did produce $x b^{2} + z$ in one step at degree $4$ but it could not produce $xb^2 + z$ at degree 3 since this element corresponds to $$b\cdot (z\cdot f - x\cdot g) - z\cdot h = (bz)\cdot f -(bx)\cdot g - z\cdot h,$$ but we have that $\deg(bz\cdot f) = 4$. We note that this behaviour of XL was the motivation for the Mutant concept.
\end{example}
